Approximate Policy Iteration for Closed-Loop Learning of Visual Tasks
Abstract
Approximate Policy Iteration (API) is a reinforcement learning paradigm that can solve high-dimensional, continuous control problems. We propose to exploit API for the closed-loop learning of mappings from images to actions. This approach requires a family of real-valued function approximators defined over visual percepts. For this purpose, we use Regression Extra-Trees, a fast yet accurate and versatile machine learning algorithm. The inputs of the Extra-Trees are visual features that summarize the informative patterns in the visual signal. We also show how to parallelize the Extra-Tree learning process to further reduce the computational expense, which is often essential in visual tasks. Experimental results on real-world images indicate that the combination of API with Extra-Trees is a promising framework for the interactive learning of visual tasks.
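The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a toy one-dimensional chain stands in for the visual task, a single scalar stands in for the visual features, policy evaluation is done by fitted Q evaluation, and the tree learner is a stripped-down Extra-Trees regressor (random cut points, all features tested at each node, leaves of fewer than 5 samples). All names (`build_tree`, `ExtraTrees`, `evaluate_policy`, `approximate_policy_iteration`) are hypothetical.

```python
import random

GAMMA, ACTIONS = 0.9, (-1, +1)   # assumed discount factor and toy action set

def mean(v):
    return sum(v) / len(v)

def sse(v):
    m = mean(v)
    return sum((t - m) ** 2 for t in v)

def build_tree(X, y, rng, n_min=5):
    """Grow one extremely randomized regression tree as nested tuples."""
    if len(y) < n_min or max(y) == min(y):
        return mean(y)                                   # leaf: average target
    best = None
    for f in range(len(X[0])):                           # here K = all features
        lo, hi = min(r[f] for r in X), max(r[f] for r in X)
        if lo == hi:
            continue                                     # constant feature
        cut = rng.uniform(lo, hi)                        # random cut, not optimized
        left = [i for i, r in enumerate(X) if r[f] < cut]
        right = [i for i, r in enumerate(X) if r[f] >= cut]
        if not left or not right:
            continue
        score = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, f, cut, left, right)
    if best is None:
        return mean(y)
    _, f, cut, left, right = best
    grow = lambda idx: build_tree([X[i] for i in idx], [y[i] for i in idx], rng, n_min)
    return (f, cut, grow(left), grow(right))

def predict_tree(node, x):
    while isinstance(node, tuple):                       # internal node
        f, cut, left, right = node
        node = left if x[f] < cut else right
    return node

class ExtraTrees:
    """Average of independently grown randomized trees."""
    def __init__(self, n_trees=5, seed=0):
        self.n_trees, self.rng = n_trees, random.Random(seed)
    def fit(self, X, y):
        self.trees = [build_tree(X, y, self.rng) for _ in range(self.n_trees)]
        return self
    def predict(self, x):
        return mean([predict_tree(t, x) for t in self.trees])

def step(s, a):
    """Toy dynamics: move 0.1 left or right on [0, 1]; reward 1 at the right end."""
    s2 = min(1.0, max(0.0, s + 0.1 * a))
    done = s2 >= 0.95
    return s2, (1.0 if done else 0.0), done

def sample_transitions(n, rng):
    data = []
    for _ in range(n):
        s, a = rng.random(), rng.choice(ACTIONS)
        data.append((s, a) + step(s, a))                 # (s, a, s2, r, done)
    return data

def q_value(q, s, a):
    return q[a].predict([s]) if q else 0.0               # Q is 0 before any fit

def evaluate_policy(data, policy, sweeps=10):
    """Fitted Q evaluation of a fixed policy, one regressor per action."""
    next_a = [policy(s2) for _, _, s2, _, _ in data]     # policy is fixed here
    q = None
    for _ in range(sweeps):
        new_q = {}
        for a in ACTIONS:
            rows = [(s, r + (0.0 if done else GAMMA * q_value(q, s2, a2)))
                    for (s, b, s2, r, done), a2 in zip(data, next_a) if b == a]
            new_q[a] = ExtraTrees().fit([[s] for s, _ in rows], [t for _, t in rows])
        q = new_q
    return q

def approximate_policy_iteration(data, iterations=10):
    policy = lambda s: -1                                # deliberately poor start
    for _ in range(iterations):
        q = evaluate_policy(data, policy)                # approximate evaluation
        policy = lambda s, q=q: max(ACTIONS, key=lambda a: q_value(q, s, a))
    return policy                                        # greedy w.r.t. last Q
```

Calling `approximate_policy_iteration(sample_transitions(200, random.Random(1)))` returns a function from a state to an action. Each tree in `ExtraTrees.fit` is grown independently of the others, so that loop is one place where the work could be distributed across processes; whether this matches the parallelization scheme of the paper is not stated in the abstract.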
Similar resources
Closed-Loop Learning of Visual Control Policies
In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision. The original contributions consist in the design of reinforcement learning algorithms that are applicable to visual spaces. Inspired by the paradigm of local-ap...
Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming
Approximate dynamic programming algorithms, such as approximate value iteration, have been successfully applied to many complex reinforcement learning tasks, and a better approximate dynamic programming algorithm is expected to further extend the applicability of reinforcement learning to various tasks. In this paper we propose a new, robust dynamic programming algorithm that unifies value iter...
On Temporal Evolution in Data Streams
The future of CiteSeer: CiteSeerX; Learning to have fun; Winning the DARPA grand challenge; Challenges of urban sensing; Learning in one-shot strategic form games; A selective sampling strategy for label ranking; Combinatorial Markov random fields; Learning stochastic tree edit distance; Pertinent background knowledge for learning protein gramma...
A multi-objective model for closed-loop supply chain optimization and efficient supplier selection in a competitive environment considering quantity discount policy
Supplier selection and allocation of the optimal order quantity are two of the most important processes in closed-loop supply chains (CLSC) and reverse logistics (RL). Providing high-quality raw material is considered a basic requirement for a manufacturer to produce popular products and to achieve a larger market share. On the other hand, considering the existence of competitive environ...
Efficient Approximate Policy Iteration Methods for Sequential Decision Making in Reinforcement Learning
(Computer Science—Machine Learning) EFFICIENT APPROXIMATE POLICY ITERATION METHODS FOR SEQUENTIAL DECISION MAKING IN REINFORCEMENT LEARNING